altman-demetrio-tarasenko Stockholm 2023 replication

A comment on Cavaillé and Ferwerda (2022)

Micah Altman (MIT), Hans Gaebler (Harvard), Georgy Tarasenko (Cornell)

Abstract

Cavaillé and Ferwerda (Cavaillé and Ferwerda 2023) examine the effect of granting immigrants access to the welfare state in Austria during 2006 on support for far-right parties. In their preferred specification, they find that the interaction of the proportion of non-EU residents in a municipality with the proportion of the population living in public housing in that municipality has an effect on the change in vote share for far-right parties of +0.67 (p < .05), supporting the main substantive claim that increasing welfare benefits to immigrants increases support for far-right parties. Moderating this, based on a subset of data from Vienna, they find that the percentage of the population in rental housing and the percentage of the population in public housing are both substantive and significant predictors of the change in far-right party vote share, with magnitudes of .03 and .09 (respectively) and p-values < .05. Based on these estimates they conclude that support was increased by the high proportion of public housing beneficiaries or low-end rentals within districts.

We first reproduce the paper’s main and secondary analyses, and subject these to computational robustness tests.

We then conduct several types of conceptual replication – examining the robustness of results to selection of covariates, reweighting, and the sensitivity of results to resampling and outliers.

We find that the research is computationally replicable, but that the evidence for the main causal claims is overstated: it is sensitive to a small set of observations, does not explain most of the variation in outcomes, and does not out-compete simpler causal explanations.

Introduction

Cavaillé and Ferwerda (Cavaillé and Ferwerda 2023) tested the effect of granting immigrants access to the welfare state in Austria during 2006 on the support for far-right parties. We focus on their two main claims, as described in the abstract (pg 19):

we show that the reform increased support for far-right parties with welfare chauvinist platforms. Electoral ward data suggest that this response was concentrated in districts with a high proportion of public housing beneficiaries or low-end rentals. Our findings provide novel evidence that distributional conflict can accelerate the rise of far-right parties in countries with substantial in-kind welfare programs

In their preferred specification, using OLS with clustered standard errors, they find that the interaction of the proportion of non-EU residents in a municipality with the proportion of the population living in public housing in that municipality has an effect on the change in vote share for far-right parties of +0.67 (p < .05) (pg 26, Table 1, col 1), supporting the main substantive claim that increasing welfare benefits to immigrants increases support for far-right parties (pg 19). Moderating this, based on a subset of data from Vienna, they find that the percentage of the population in rental housing and the percentage of the population in public housing are both substantive and significant predictors of the change in far-right party vote share, with magnitudes of .03 and .09 (respectively) and p-values < .05 (pg 30, Table 2, col 1). Based on these estimates they conclude that support was accelerated by the high proportion of public housing beneficiaries or low-end rentals within districts (pg 19).

We obtained the published replication data set (Cavaille 2023) from the Harvard Dataverse archive, where it had been deposited for public use by the authors. The replication data set included the final processed data used to produce the paper's results, along with R code to replicate figures and tables. The data were sufficient to check computational reproducibility but posed challenges for conceptual reproducibility, because the archive provided neither copies of the source data nor links to or citations of the original data sources (which were described only generally). Further, although the replication data did include additional measures not used in the publication, these were largely undocumented, and hence difficult to interpret reliably.

We first reproduced the paper’s main and secondary analyses, and subjected these to computational robustness tests. The computational reproductions showed both the main and secondary results to be reproducible. As an unintended consequence, reproducing the results revealed the substantively poor fit of these models: the primary model has an R-squared of .0356. This finding informed further replication analysis.

We then conducted several types of conceptual replication – examining the robustness of results to selection of covariates, reweighting, and the sensitivity of results to resampling and outliers.

Prompted by the unexpectedly poor fit of the primary model, we performed a conceptual replication that applied the authors’ linear model with clustered errors to alternate sets of covariates, and to models that did not include the interaction effects that are central to the paper’s proposed causal explanation. This reveals that the interaction term does not provide a substantial improvement in explanatory power over a more naive baseline.

The authors suggest, in their first analysis of nation-wide election data, that voters in municipalities with high proportions of residents living in public housing as well as comparatively high proportions of third-country nationals voted more for far-right parties as a result of the demand shock on public housing. We undertake a conceptual replication of this result using their data on the city of Vienna. In particular, we look at census tracts—as a proxy for neighborhood—to see if tracts with a higher proportion of third-country nationals and higher proportions of individuals living in public housing tend to vote for far-right parties at higher rates. We find that implementing this conceptual replication increases the magnitude of the main point estimate for the interaction between these two factors, but the estimate is no longer statistically significant at the 5% level.

The authors fit the models in both of their main analyses at the level of administrative units; that is, all municipalities are equally weighted in their nation-wide analysis, and all tracts are equally weighted in their analysis of Vienna. However, it is unclear whether this weighting is substantively appropriate. To test the robustness of their conclusions to this choice, we reweight their model so that each administrative unit has weight equal to the number of voters it contains. Implementing this robustness check in their primary analysis increases the magnitude of their main point estimate for the interaction of the proportion of residents living in public housing with the proportion of third-country nationals residing in a municipality, which remains significant at the 5% level. Implementing this robustness check in their secondary analysis has no effect on the magnitude or the statistical significance of the main point estimates.
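The reweighting idea can be illustrated with a minimal sketch in base R. This is not the code used for the results reported here: the data are simulated, and every column name (`public_housing`, `noneu`, `n_voters`, `d_vote`) is a hypothetical stand-in rather than a variable from the replication archive.

```r
# Minimal sketch with simulated data; all column names are hypothetical.
set.seed(1)
n <- 200
units.df <- data.frame(
  public_housing = runif(n, 0, 0.3),                     # share in public housing
  noneu          = runif(n, 0, 0.2),                     # share of third-country nationals
  n_voters       = sample(100:20000, n, replace = TRUE)  # size of the unit
)
units.df$d_vote <- 0.03 + 0.5 * units.df$public_housing * units.df$noneu +
  rnorm(n, sd = 0.02)

# Unweighted fit: every administrative unit counts equally.
fit_unweighted <- lm(d_vote ~ public_housing * noneu, data = units.df)

# Weighted fit: each unit counts in proportion to its number of voters.
fit_weighted <- lm(d_vote ~ public_housing * noneu, data = units.df,
                   weights = n_voters)

# Side-by-side comparison of the coefficient estimates.
coef_compare <- cbind(unweighted = coef(fit_unweighted),
                      weighted   = coef(fit_weighted))
print(round(coef_compare, 4))
```

The same `weights` argument is, to our understanding, also accepted by `estimatr::lm_robust()`, so a weighted fit could in principle be combined with clustered standard errors.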

Next, to better understand the authors’ main claims and how well their models fit the data, and to search for potential anomalies not evident in low-dimensional summaries, we attempted to visually replicate their primary and secondary analyses. Based on this graphical exploration, we carried out an outlier analysis.

We assessed the robustness of the findings from the Austrian sample by examining potential outliers in the key variables. We observed that the previously significant interaction effect between the share of non-EU residents and the share of public housing loses its significance when we exclude abnormally large values of the key variables. Covariate balance analysis highlights substantial differences between observations with outliers and those without. When we reevaluate the main argument using only the sample of outliers, we find that all the initial effects become notably stronger in both magnitude and significance. This suggests that the authors’ primary argument may not generalize to all Austrian districts within the sample.
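The logic of this outlier check can be sketched as follows. The flagging rule (values above the upper Tukey fence, Q3 + 1.5 × IQR) is an assumption chosen for illustration, not necessarily the rule applied in our analysis. The data are simulated and the column names are hypothetical.

```r
# Minimal sketch with simulated data; names and the outlier rule are assumptions.
set.seed(2)
n <- 500
muni.df <- data.frame(
  public_housing = rexp(n, rate = 20),  # right-skewed shares
  noneu          = rexp(n, rate = 30)
)
muni.df$d_vote <- 0.03 + 0.02 * muni.df$public_housing + rnorm(n, sd = 0.02)

# Flag abnormally large values: above the upper Tukey fence.
is_high <- function(x) x > quantile(x, 0.75) + 1.5 * IQR(x)
muni.df$outlier <- is_high(muni.df$public_housing) | is_high(muni.df$noneu)

# Refit the same specification on the full, trimmed, and outlier-only samples.
fit_all     <- lm(d_vote ~ public_housing * noneu, data = muni.df)
fit_trimmed <- lm(d_vote ~ public_housing * noneu, data = muni.df,
                  subset = !outlier)
fit_outlier <- lm(d_vote ~ public_housing * noneu, data = muni.df,
                  subset = outlier)

# Extract the interaction row (estimate, SE, t, p) from each fit.
interaction_row <- function(fit) {
  summary(fit)$coefficients["public_housing:noneu", ]
}
interaction_compare <- rbind(all           = interaction_row(fit_all),
                             trimmed       = interaction_row(fit_trimmed),
                             outliers_only = interaction_row(fit_outlier))
print(interaction_compare)
```

Comparing the three rows shows whether the sign, size, and significance of the interaction term depend on the flagged observations.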

Finally, given the substantive importance placed on the interaction term in the authors’ primary analysis, as well as the generally low goodness-of-fit of the models considered, we evaluated the main models of the primary and secondary analyses using out-of-sample tests of predictive accuracy. We find in both cases that the models with and without the interaction term have essentially the same mean squared error, suggesting that the substantive interpretation of the interaction effect may not be appropriate.
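An out-of-sample comparison of this kind can be sketched with k-fold cross-validation in base R. The helper `cv_mse()` is a hypothetical function written for this illustration. The number of folds and the simulated data are likewise assumptions rather than a description of the procedure actually used.

```r
# Minimal sketch with simulated data; cv_mse() is a hypothetical helper.
set.seed(3)
n <- 400
sim.df <- data.frame(x1 = runif(n), x2 = runif(n))
sim.df$y <- 0.04 + 0.05 * sim.df$x1 + 0.02 * sim.df$x2 + rnorm(n, sd = 0.03)

# k-fold cross-validated mean squared error for a given model formula.
cv_mse <- function(model_formula, data, k = 10) {
  response <- all.vars(model_formula)[1]             # name of the outcome column
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  sq_err <- numeric(nrow(data))
  for (i in seq_len(k)) {
    held_out <- fold == i
    fit  <- lm(model_formula, data = data[!held_out, ])
    pred <- predict(fit, newdata = data[held_out, ])
    sq_err[held_out] <- (data[[response]][held_out] - pred)^2
  }
  mean(sq_err)
}

# Resetting the seed gives both models the same fold assignment.
set.seed(4)
mse_main <- cv_mse(y ~ x1 + x2, sim.df)   # main effects only
set.seed(4)
mse_int  <- cv_mse(y ~ x1 * x2, sim.df)   # adds the interaction term

print(c(main_effects = mse_main, with_interaction = mse_int))
```

If the interaction term carried real predictive information, the second error would be noticeably smaller than the first. Near-identical values indicate that the term adds little out of sample.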

Reproducibility

Simple Direct Reproduction (Calibration)

As a baseline check of the model, and of our understanding of it, we conducted a simple computational replication using the authors’ supplied data and code, and compared the output to the published results.

Reproduction of Table 1 - Model 1 (Primary Result)

Loading required namespace: haven
Loading required namespace: estimatr
term estimate std.error statistic p.value conf.low conf.high df outcome
comp
(Intercept) 0.03684645 0.0009130496 40.3553683 1.803168e-271 0.0350559924 0.03863691 2369 d_rr_06
dv_pop_01 0.02422834 0.0121140293 2.0000234 4.561171e-02 0.0004731441 0.04798354 2369 d_rr_06
pct_noneu_06 -0.02263235 0.0305294844 -0.7413275 4.585684e-01 -0.0824996238 0.03723493 2369 d_rr_06
dv_pop_01:pct_noneu_06 0.66842772 0.1658860281 4.0294395 5.767257e-05 0.3431308814 0.99372456 2369 d_rr_06
original
(Intercept) 0.04000000 0.0000000000 NA 5.000000e-02 NA NA NA NA
dv_pop_01 0.02000000 0.0300000000 NA 1.000000e+00 NA NA NA NA
pct_noneu_06 -0.02000000 0.0300000000 NA 5.000000e-02 NA NA NA NA
dv_pop_01:pct_noneu_06 0.67000000 0.1700000000 NA 5.000000e-02 NA NA NA NA
r.squared adj.r.squared df.residual res_var nobs
comp
0.03560054 0.03437926 2369 0.000802094 2373

Reproduction of Table 2 - Model 1 (Secondary Result)

Code
vienna_authors.df <- 
  haven::read_dta("authors replication materials/vienna_final.dta")

m2.formula<- formula("dv ~  (pctrental + pctpublic_w_zsp)")

results_m2_computational.lmr <-
  estimatr::lm_robust(m2.formula,   data = vienna_authors.df, 
                      clusters = tract_key)
Warning in eval(quote({: Some observations have missingness in the cluster
variable(s) but not in the outcome or covariates. These observations have been
dropped.
Code
tidy_m2_original.df <-
  structure(
    list(
      term = c(
        "(Intercept)",
        "pctrental",
        "pctpublic_w_zsp"
      ),
      estimate = c(0.04, 0.03, 0.09),
      std.error = c(0.01, 0.01, 0.01),
      p.value = c(0.05, 0.05,  0.05),
      repl_id = c("original", "original", "original")
    ),
    row.names = c(NA,-3L),
    class = "data.frame"
  )

repro_m2.df <- tidy_results(list(comp=results_m2_computational.lmr)) %>% 
   bind_rows(tidy_m2_original.df)

repro_m2_summary.df <- tidy_summary(list(comp=results_m2_computational.lmr))

repro_m2.df %>% group_by(repl_id) %>% gt()
term estimate std.error statistic p.value conf.low conf.high df outcome
comp
(Intercept) 0.04219616 0.01073662 3.930116 2.263033e-04 0.020710245 0.06368207 58.74038 dv
pctrental 0.03212893 0.01442834 2.226793 2.916168e-02 0.003355367 0.06090249 70.39755 dv
pctpublic_w_zsp 0.08703017 0.01330078 6.543237 8.609872e-09 0.060499496 0.11356085 69.53739 dv
original
(Intercept) 0.04000000 0.01000000 NA 5.000000e-02 NA NA NA NA
pctrental 0.03000000 0.01000000 NA 5.000000e-02 NA NA NA NA
pctpublic_w_zsp 0.09000000 0.01000000 NA 5.000000e-02 NA NA NA NA
Code
repro_m2_summary.df %>% group_by(repl_id) %>% gt()
r.squared adj.r.squared df.residual res_var nobs
comp
0.1508865 0.1499319 1779 0.001921443 1782

Summary - Direct Reproducibility

The computational reproductions showed both the main and secondary results to be reproducible.

As an unintended consequence, reproducing the results revealed the substantively poor fit of these models: the primary model has an R-squared of .0356. This finding informed the replication analysis.

Computational robustness

Generally, even with a fixed statistical model family and specification, results may vary with the estimation algorithm used and with a specific software package’s implementation of it (Altman, Gill, and McDonald 2004). We evaluate the computational robustness of the model by using alternative algorithms and implementations.

Primary Result

Code
requireNamespace("arm")
Loading required namespace: arm
Code
results_m1_statlm.lm <-
  stats::lm(m1.formula, data = austria_authors.df)
results_m1_statglm.glm <-
  stats::glm(m1.formula, data = austria_authors.df)
results_m1_statbayeslm.glm <-
  arm::bayesglm(m1.formula, data = austria_authors.df, 
                prior.scale=Inf, prior.df=Inf)

#Note: could also add nls, mle2 -- would require re-expressing current formula in different syntax, and renaming results matrix

ml.ls <- list(lmrobust = results_m1_computational.lmr,
              lm=results_m1_statlm.lm,
              glm=results_m1_statglm.glm,
              bayes=results_m1_statbayeslm.glm)

alt_est_m1.df <- tidy_results(ml.ls)
Warning: The `tidy()` method for objects of class `bayesglm` is not maintained by the broom team, and is only supported through the `glm` tidier method. Please be cautious in interpreting and reporting broom output.

This warning is displayed once per session.
Code
alt_est_m1_summary.df <- tidy_summary(ml.ls)

alt_est_m1.df  %>% group_by(repl_id) %>% gt()
term estimate std.error statistic p.value conf.low conf.high df outcome
lmrobust
(Intercept) 0.03684645 0.0009130496 40.3553683 1.803168e-271 0.0350559924 0.03863691 2369 d_rr_06
dv_pop_01 0.02422834 0.0121140293 2.0000234 4.561171e-02 0.0004731441 0.04798354 2369 d_rr_06
pct_noneu_06 -0.02263235 0.0305294844 -0.7413275 4.585684e-01 -0.0824996238 0.03723493 2369 d_rr_06
dv_pop_01:pct_noneu_06 0.66842772 0.1658860281 4.0294395 5.767257e-05 0.3431308814 0.99372456 2369 d_rr_06
lm
(Intercept) 0.03684645 0.0009202448 40.0398359 3.410573e-268 NA NA NA NA
dv_pop_01 0.02422834 0.0120977011 2.0027228 4.532081e-02 NA NA NA NA
pct_noneu_06 -0.02263235 0.0303293021 -0.7462205 4.556083e-01 NA NA NA NA
dv_pop_01:pct_noneu_06 0.66842772 0.1832818503 3.6469935 2.710478e-04 NA NA NA NA
glm
(Intercept) 0.03684645 0.0009202448 40.0398359 3.410573e-268 NA NA NA NA
dv_pop_01 0.02422834 0.0120977011 2.0027228 4.532081e-02 NA NA NA NA
pct_noneu_06 -0.02263235 0.0303293021 -0.7462205 4.556083e-01 NA NA NA NA
dv_pop_01:pct_noneu_06 0.66842772 0.1832818503 3.6469935 2.710478e-04 NA NA NA NA
bayes
(Intercept) 0.03684637 0.0009202444 40.0397659 3.416280e-268 NA NA NA NA
dv_pop_01 0.02422834 0.0120977011 2.0027228 4.532081e-02 NA NA NA NA
pct_noneu_06 -0.02263235 0.0303293021 -0.7462205 4.556083e-01 NA NA NA NA
dv_pop_01:pct_noneu_06 0.66842772 0.1832818503 3.6469935 2.710478e-04 NA NA NA NA
Code
alt_est_m1_summary.df %>% group_by(repl_id) %>% gt()
r.squared adj.r.squared df.residual res_var nobs sigma statistic p.value df aic
lmrobust
0.03560054 0.03437926 2369 0.000802094 2373 NA NA NA NA NA
lm
0.03560054 0.03437926 2369 NA 2373 0.02832126 29.15032 1.668541e-18 3 NA
glm
NA NA 2369 NA NA NA NA NA NA -10175.14
bayes
NA NA 2369 NA NA NA NA NA NA -10175.14
Code
rm(ml.ls,results_m1_statlm.lm,results_m1_statglm.glm,results_m1_statbayeslm.glm)

Secondary Result

Code
requireNamespace("arm")

results_m2_statlm.lm <-
  stats::lm(m2.formula, data = vienna_authors.df)
results_m2_statglm.glm <-
  stats::glm(m2.formula, data = vienna_authors.df)
results_m2_statbayeslm.glm <-
  arm::bayesglm(m2.formula, data = vienna_authors.df, 
                prior.scale=Inf, prior.df=Inf)

#Note: could also add nls, mle2 -- would require re-expressing current formula in different syntax, and renaming results matrix

ml.ls <- list(lmrobust = results_m2_computational.lmr,
              lm=results_m2_statlm.lm,
              glm=results_m2_statglm.glm,
              bayes=results_m2_statbayeslm.glm)

alt_est_m2.df <- tidy_results(ml.ls)
alt_est_m2_summary.df <- tidy_summary(ml.ls)

alt_est_m2.df  %>% group_by(repl_id) %>% gt()
term estimate std.error statistic p.value conf.low conf.high df outcome
lmrobust
(Intercept) 0.04219616 0.010736620 3.930116 2.263033e-04 0.020710245 0.06368207 58.74038 dv
pctrental 0.03212893 0.014428340 2.226793 2.916168e-02 0.003355367 0.06090249 70.39755 dv
pctpublic_w_zsp 0.08703017 0.013300783 6.543237 8.609872e-09 0.060499496 0.11356085 69.53739 dv
lm
(Intercept) 0.04117427 0.006278568 6.557909 7.125926e-11 NA NA NA NA
pctrental 0.03375491 0.007815795 4.318807 1.654994e-05 NA NA NA NA
pctpublic_w_zsp 0.08807870 0.007752433 11.361427 6.254012e-29 NA NA NA NA
glm
(Intercept) 0.04117427 0.006278568 6.557909 7.125926e-11 NA NA NA NA
pctrental 0.03375491 0.007815795 4.318807 1.654994e-05 NA NA NA NA
pctpublic_w_zsp 0.08807870 0.007752433 11.361427 6.254012e-29 NA NA NA NA
bayes
(Intercept) 0.04117408 0.006278568 6.557878 7.127361e-11 NA NA NA NA
pctrental 0.03375491 0.007815795 4.318807 1.654994e-05 NA NA NA NA
pctpublic_w_zsp 0.08807870 0.007752433 11.361427 6.254012e-29 NA NA NA NA
Code
alt_est_m2_summary.df %>% group_by(repl_id) %>% gt()
r.squared adj.r.squared df.residual res_var nobs sigma statistic p.value df aic
lmrobust
0.1508865 0.1499319 1779 0.001921443 1782 NA NA NA NA NA
lm
0.1492961 0.1483424 1784 NA 1787 0.04389489 156.5435 2.303283e-63 2 NA
glm
NA NA 1784 NA NA NA NA NA NA -6095.888
bayes
NA NA 1784 NA NA NA NA NA NA -6095.888
Code
rm(ml.ls,results_m2_statlm.lm,results_m2_statglm.glm,results_m2_statbayeslm.glm)

Summary - Computational Robustness of Reproducibility

The computational reproductions showed both the main and secondary results to be robust to alternative choices of algorithm and software implementation.
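Agreement across estimation routines can also be quantified directly rather than read off the tables. The sketch below uses simulated data with hypothetical variable names. It fits one linear specification three ways (QR decomposition via `lm`, iteratively reweighted least squares via `glm`, and an explicit solution of the normal equations) and reports the largest absolute discrepancy among the coefficient vectors.

```r
# Minimal sketch with simulated data; variable names are hypothetical.
set.seed(5)
n <- 300
d <- data.frame(a = runif(n), b = runif(n))
d$y <- 0.04 + 0.02 * d$a - 0.02 * d$b + 0.6 * d$a * d$b + rnorm(n, sd = 0.03)

f <- y ~ a * b
X <- model.matrix(f, d)   # design matrix including the interaction column

est <- cbind(
  lm_qr     = coef(lm(f, data = d)),    # QR decomposition
  glm_irls  = coef(glm(f, data = d)),   # IRLS with the default gaussian family
  normal_eq = drop(solve(crossprod(X), crossprod(X, d$y)))  # (X'X)^{-1} X'y
)

# Largest absolute difference from the lm() estimates, over all coefficients.
max_abs_diff <- max(abs(est - est[, "lm_qr"]))
print(est)
print(max_abs_diff)
```

For a well-conditioned design such as this one the discrepancy should be at the level of floating-point rounding. Larger differences would point to numerical instability in one of the routines.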

Replication

Conceptual Replication: Alternate OLS Specification (covariate robustness)

Prompted by the unexpectedly poor fit of the primary model, we performed a conceptual replication that applied the authors’ linear model with clustered errors to alternate sets of covariates, and to models that did not include the interaction effects that are central to the paper’s proposed causal explanation.

Alternate Covariates & Model 1

Code
#NOTE: could also explore alternate methods for computing robust standard errors (e.g. sensemakr, lmtest, sandwich)

results_m1_clustered.lmr <-  estimatr::lm_robust(m1.formula, data = austria_authors.df, clusters = bezirk)

results_m1_interactionsonly.lmr <-
  estimatr::lm_robust(formula("d_rr_06 ~ dv_pop_01:pct_noneu_06 -dv_pop_01 -pct_noneu_06"),
                      data = austria_authors.df)

results_m1_kitchensink.lmr <-
  estimatr::lm_robust(
    formula(
      "d_rr_06 ~ dv_pop_01:pct_noneu_06 +educ_tertiary +avg_income +lab_pct_manufact_01 +lab_pct_unemp +welfare_cap_06 +health_cap_06 +education_cap_06 +foreignborn_delta+citizen_eu_growth_pct +vacancy_01_public"
    ),
    data = austria_authors.df
  )

results_m1_kitchensinkmain.lmr <-
  estimatr::lm_robust(
    formula(
      "d_rr_06 ~ dv_pop_01+pct_noneu_06 +educ_tertiary +avg_income +lab_pct_manufact_01 +lab_pct_unemp +welfare_cap_06 +health_cap_06 +education_cap_06 +foreignborn_delta+citizen_eu_growth_pct +vacancy_01_public"
    ),
    data = austria_authors.df
  )

# vacancy has high missingness
results_m1_kitchensink_novacancy.lmr <-
  estimatr::lm_robust(
    formula(
      "d_rr_06 ~ dv_pop_01:pct_noneu_06 +educ_tertiary +avg_income +lab_pct_manufact_01 +lab_pct_unemp +welfare_cap_06 +health_cap_06 +education_cap_06 +foreignborn_delta+citizen_eu_growth_pct "
    ),
    data = austria_authors.df
  )

results_m1_kitchensinkmain_novacancy.lmr <-
  estimatr::lm_robust(
    formula(
      "d_rr_06 ~ dv_pop_01+pct_noneu_06 +educ_tertiary +avg_income +lab_pct_manufact_01 +lab_pct_unemp +welfare_cap_06 +health_cap_06 +education_cap_06 +foreignborn_delta+citizen_eu_growth_pct "
    ),
    data = austria_authors.df
  )

results_m1_mainonly.lmr <-
  estimatr::lm_robust(formula("d_rr_06 ~ dv_pop_01 + pct_noneu_06 "), data = austria_authors.df)

results_m1_euonly.lmr <-
  estimatr::lm_robust(formula("d_rr_06 ~ pct_noneu_06 "), data = austria_authors.df)

results_m1_poponly.lmr <-
  estimatr::lm_robust(formula("d_rr_06 ~ dv_pop_01 "), data = austria_authors.df)

ml.ls <- list(
  author_model1 = results_m1_computational.lmr,
  author_clusterederr = results_m1_clustered.lmr,
  loaded_model = results_m1_kitchensink.lmr,
  loaded_main = results_m1_kitchensinkmain.lmr,
  loaded_nv = results_m1_kitchensink_novacancy.lmr,
  loaded_main_nv = results_m1_kitchensinkmain_novacancy.lmr,
  interaction = results_m1_interactionsonly.lmr,
  maineffects = results_m1_mainonly.lmr,
  single_eu = results_m1_euonly.lmr,
  single_pop = results_m1_poponly.lmr
)

alt_var_m1.df <- tidy_results(ml.ls)
alt_var_m1_summary.df <- tidy_summary(ml.ls)

alt_var_m1.df  %>% group_by(repl_id) %>% gt()
term estimate std.error statistic p.value conf.low conf.high df outcome
author_model1
(Intercept) 3.684645e-02 9.130496e-04 40.35536829 1.803168e-271 3.505599e-02 3.863691e-02 2369.00000 d_rr_06
dv_pop_01 2.422834e-02 1.211403e-02 2.00002340 4.561171e-02 4.731441e-04 4.798354e-02 2369.00000 d_rr_06
pct_noneu_06 -2.263235e-02 3.052948e-02 -0.74132752 4.585684e-01 -8.249962e-02 3.723493e-02 2369.00000 d_rr_06
dv_pop_01:pct_noneu_06 6.684277e-01 1.658860e-01 4.02943954 5.767257e-05 3.431309e-01 9.937246e-01 2369.00000 d_rr_06
author_clusterederr
(Intercept) 3.684645e-02 1.867787e-03 19.72732624 6.702775e-28 3.311026e-02 4.058265e-02 59.94905 d_rr_06
dv_pop_01 2.422834e-02 2.417527e-02 1.00219521 3.219979e-01 -2.456168e-02 7.301837e-02 41.93177 d_rr_06
pct_noneu_06 -2.263235e-02 6.136912e-02 -0.36879050 7.140825e-01 -1.463730e-01 1.011083e-01 43.26508 d_rr_06
dv_pop_01:pct_noneu_06 6.684277e-01 2.479705e-01 2.69559334 1.107872e-02 1.634470e-01 1.173408e+00 32.19299 d_rr_06
loaded_model
(Intercept) 1.707617e-02 7.699244e-03 2.21790203 2.673331e-02 1.971909e-03 3.218043e-02 1304.00000 d_rr_06
educ_tertiary 8.034463e-04 4.487743e-02 0.01790313 9.857189e-01 -8.723641e-02 8.884330e-02 1304.00000 d_rr_06
avg_income 1.117378e-06 3.987979e-07 2.80186544 5.155941e-03 3.350224e-07 1.899734e-06 1304.00000 d_rr_06
lab_pct_manufact_01 3.502320e-03 1.068684e-02 0.32772251 7.431741e-01 -1.746297e-02 2.446761e-02 1304.00000 d_rr_06
lab_pct_unemp 3.355020e-02 7.572057e-02 0.44307903 6.577821e-01 -1.149973e-01 1.820977e-01 1304.00000 d_rr_06
welfare_cap_06 1.504246e-05 9.551844e-06 1.57482250 1.155400e-01 -3.696204e-06 3.378112e-05 1304.00000 d_rr_06
health_cap_06 7.354016e-07 5.350671e-06 0.13744102 8.907034e-01 -9.761463e-06 1.123227e-05 1304.00000 d_rr_06
education_cap_06 -2.009510e-05 6.817849e-06 -2.94742592 3.261307e-03 -3.347026e-05 -6.719952e-06 1304.00000 d_rr_06
foreignborn_delta -3.006769e-02 6.515503e-02 -0.46147910 6.445319e-01 -1.578878e-01 9.775247e-02 1304.00000 d_rr_06
citizen_eu_growth_pct 4.687133e-02 1.985481e-02 2.36070479 1.838674e-02 7.920478e-03 8.582219e-02 1304.00000 d_rr_06
vacancy_01_public -2.651650e-02 7.421029e-03 -3.57315706 3.655604e-04 -4.107496e-02 -1.195804e-02 1304.00000 d_rr_06
dv_pop_01:pct_noneu_06 6.234773e-01 1.444432e-01 4.31641773 1.705715e-05 3.401107e-01 9.068438e-01 1304.00000 d_rr_06
loaded_main
(Intercept) 1.660115e-02 7.763011e-03 2.13849332 3.266224e-02 1.371779e-03 3.183052e-02 1303.00000 d_rr_06
dv_pop_01 4.057408e-02 1.084747e-02 3.74042014 1.916907e-04 1.929367e-02 6.185449e-02 1303.00000 d_rr_06
pct_noneu_06 1.285873e-02 3.349882e-02 0.38385609 7.011477e-01 -5.285881e-02 7.857626e-02 1303.00000 d_rr_06
educ_tertiary 1.200251e-03 4.512472e-02 0.02659853 9.787840e-01 -8.732480e-02 8.972530e-02 1303.00000 d_rr_06
avg_income 1.040712e-06 4.067235e-07 2.55877108 1.061648e-02 2.428078e-07 1.838617e-06 1303.00000 d_rr_06
lab_pct_manufact_01 3.916529e-03 1.063020e-02 0.36843409 7.126094e-01 -1.693765e-02 2.477071e-02 1303.00000 d_rr_06
lab_pct_unemp 3.550574e-02 7.695067e-02 0.46140908 6.445822e-01 -1.154550e-01 1.864665e-01 1303.00000 d_rr_06
welfare_cap_06 1.541928e-05 9.860061e-06 1.56381186 1.181046e-01 -3.924052e-06 3.476261e-05 1303.00000 d_rr_06
health_cap_06 3.271295e-06 6.049464e-06 0.54075777 5.887670e-01 -8.596461e-06 1.513905e-05 1303.00000 d_rr_06
education_cap_06 -2.062670e-05 6.872121e-06 -3.00150385 2.737613e-03 -3.410833e-05 -7.145065e-06 1303.00000 d_rr_06
foreignborn_delta -8.783673e-03 6.733893e-02 -0.13043975 8.962387e-01 -1.408883e-01 1.233209e-01 1303.00000 d_rr_06
citizen_eu_growth_pct 5.729275e-02 2.003809e-02 2.85919151 4.314880e-03 1.798229e-02 9.660321e-02 1303.00000 d_rr_06
vacancy_01_public -2.600573e-02 7.540595e-03 -3.44876421 5.811276e-04 -4.079877e-02 -1.121270e-02 1303.00000 d_rr_06
loaded_nv
(Intercept) 2.466387e-02 7.246167e-03 3.40371305 6.764997e-04 1.045394e-02 3.887380e-02 2233.00000 d_rr_06
educ_tertiary -2.081675e-02 2.711079e-02 -0.76784022 4.426633e-01 -7.398173e-02 3.234823e-02 2233.00000 d_rr_06
avg_income 8.817519e-07 3.486483e-07 2.52905845 1.150522e-02 1.980432e-07 1.565461e-06 2233.00000 d_rr_06
lab_pct_manufact_01 2.637047e-03 8.124972e-03 0.32456078 7.455439e-01 -1.329624e-02 1.857034e-02 2233.00000 d_rr_06
lab_pct_unemp 1.203452e-02 5.756899e-02 0.20904523 8.344320e-01 -1.008598e-01 1.249289e-01 2233.00000 d_rr_06
welfare_cap_06 1.863581e-05 7.621652e-06 2.44511430 1.455740e-02 3.689545e-06 3.358207e-05 2233.00000 d_rr_06
health_cap_06 -9.213061e-07 3.801267e-06 -0.24236814 8.085172e-01 -8.375694e-06 6.533082e-06 2233.00000 d_rr_06
education_cap_06 -3.250227e-05 5.056048e-06 -6.42839513 1.571852e-10 -4.241732e-05 -2.258723e-05 2233.00000 d_rr_06
foreignborn_delta -7.441627e-03 5.115881e-02 -0.14546131 8.843598e-01 -1.077654e-01 9.288218e-02 2233.00000 d_rr_06
citizen_eu_growth_pct 7.217839e-02 1.684078e-02 4.28593061 1.897038e-05 3.915318e-02 1.052036e-01 2233.00000 d_rr_06
dv_pop_01:pct_noneu_06 7.432948e-01 1.223423e-01 6.07553134 1.448731e-09 5.033781e-01 9.832114e-01 2233.00000 d_rr_06
loaded_main_nv
(Intercept) 2.345531e-02 7.186705e-03 3.26370802 1.116248e-03 9.361982e-03 3.754863e-02 2232.00000 d_rr_06
dv_pop_01 5.584260e-02 9.781646e-03 5.70891634 1.288350e-08 3.666052e-02 7.502467e-02 2232.00000 d_rr_06
pct_noneu_06 -2.170304e-02 2.734349e-02 -0.79371855 4.274437e-01 -7.532438e-02 3.191830e-02 2232.00000 d_rr_06
educ_tertiary -2.525183e-02 2.752201e-02 -0.91751409 3.589724e-01 -7.922324e-02 2.871958e-02 2232.00000 d_rr_06
avg_income 8.252228e-07 3.505775e-07 2.35389575 1.866407e-02 1.377308e-07 1.512715e-06 2232.00000 d_rr_06
lab_pct_manufact_01 1.906930e-03 8.183877e-03 0.23301061 8.157745e-01 -1.414188e-02 1.795574e-02 2232.00000 d_rr_06
lab_pct_unemp 4.038334e-02 5.643949e-02 0.71551568 4.743653e-01 -7.029604e-02 1.510627e-01 2232.00000 d_rr_06
welfare_cap_06 2.368010e-05 7.935170e-06 2.98419525 2.874109e-03 8.119011e-06 3.924118e-05 2232.00000 d_rr_06
health_cap_06 3.619708e-06 4.050116e-06 0.89372962 3.715629e-01 -4.322679e-06 1.156210e-05 2232.00000 d_rr_06
education_cap_06 -3.259706e-05 5.088113e-06 -6.40651206 1.810189e-10 -4.257499e-05 -2.261913e-05 2232.00000 d_rr_06
foreignborn_delta 3.084490e-02 5.228670e-02 0.58991863 5.553049e-01 -7.169075e-02 1.333805e-01 2232.00000 d_rr_06
citizen_eu_growth_pct 8.537299e-02 1.723695e-02 4.95290551 7.859501e-07 5.157086e-02 1.191751e-01 2232.00000 d_rr_06
interaction
(Intercept) 3.729305e-02 6.214137e-04 60.01324146 0.000000e+00 3.607448e-02 3.851162e-02 2371.00000 d_rr_06
dv_pop_01:pct_noneu_06 8.250730e-01 7.286844e-02 11.32277579 5.536443e-29 6.821805e-01 9.679654e-01 2371.00000 d_rr_06
maineffects
(Intercept) 3.507456e-02 7.805412e-04 44.93621311 1.681207e-319 3.354395e-02 3.660518e-02 2370.00000 d_rr_06
dv_pop_01 5.553859e-02 9.065141e-03 6.12661062 1.048046e-09 3.776216e-02 7.331502e-02 2370.00000 d_rr_06
pct_noneu_06 4.613258e-02 2.447440e-02 1.88493177 5.956130e-02 -1.860879e-03 9.412604e-02 2370.00000 d_rr_06
single_eu
(Intercept) 3.644139e-02 7.787135e-04 46.79690783 0.000000e+00 3.491436e-02 3.796842e-02 2373.00000 d_rr_06
pct_noneu_06 1.182574e-01 2.188413e-02 5.40379634 7.175702e-08 7.534340e-02 1.611714e-01 2373.00000 d_rr_06
single_pop
(Intercept) 3.571047e-02 7.070819e-04 50.50400684 0.000000e+00 3.432391e-02 3.709703e-02 2372.00000 d_rr_06
dv_pop_01 6.324194e-02 7.615920e-03 8.30391353 1.670642e-16 4.830739e-02 7.817649e-02 2372.00000 d_rr_06
Code
alt_var_m1_summary.df %>% group_by(repl_id) %>% gt()
r.squared adj.r.squared df.residual res_var nobs
author_model1
0.03560054 0.03437926 2369 0.0008020940 2373
author_clusterederr
0.03560054 0.03437926 2369 0.0008020940 2373
loaded_model
0.07259563 0.06477244 1304 0.0007556057 1316
loaded_main
0.07073519 0.06217711 1303 0.0007577025 1316
loaded_nv
0.07318432 0.06903378 2233 0.0007574832 2244
loaded_main_nv
0.07273932 0.06816949 2232 0.0007581864 2244
interaction
0.03344442 0.03303676 2371 0.0008032092 2373
maineffects
0.03018599 0.02936758 2370 0.0008062570 2373
single_eu
0.01288978 0.01247381 2373 0.0008197834 2375
single_pop
0.02864185 0.02823234 2372 0.0008068610 2374
Code
rm(
  ml.ls,
  results_m1_clustered.lmr,
  results_m1_kitchensink.lmr,
  results_m1_interactionsonly.lmr,
  results_m1_kitchensink_novacancy.lmr,
  results_m1_kitchensinkmain_novacancy.lmr,
  results_m1_mainonly.lmr,
  results_m1_euonly.lmr,
  results_m1_poponly.lmr,
  results_m1_kitchensinkmain.lmr
)

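Observe that none of the alternative models fit the data well: the best-fitting alternative has an R-squared of about .073. Moreover, models without interaction effects provide only slightly lower explanatory power. For example, the main-effects-only model has an R-squared of .030, compared with .036 for the authors’ specification.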
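The comparison of explanatory power with and without the interaction term can be sketched as a nested-model comparison. The example uses simulated data and hypothetical variable names. It relies on the classical F-test from `anova()`, which is an illustrative substitute and does not use the robust or clustered standard errors of the models reported above.

```r
# Minimal sketch with simulated data; variable names are hypothetical.
set.seed(6)
n <- 1000
d <- data.frame(a = runif(n, 0, 0.3), b = runif(n, 0, 0.2))
d$y <- 0.035 + 0.05 * d$a + 0.04 * d$b + rnorm(n, sd = 0.03)

fit_main <- lm(y ~ a + b, data = d)   # main effects only
fit_int  <- lm(y ~ a * b, data = d)   # main effects plus interaction

# R-squared of each model and the gain attributable to the interaction term.
r2 <- c(main        = summary(fit_main)$r.squared,
        interaction = summary(fit_int)$r.squared)
delta_r2 <- unname(r2["interaction"] - r2["main"])
print(round(c(r2, delta = delta_r2), 4))

# Classical F-test of the nested comparison (not cluster-robust).
print(anova(fit_main, fit_int))
```

A small `delta_r2` corresponds to the pattern in the tables above, where adding the interaction term changes the share of explained variance only marginally.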

Alternate Covariates & Model 2

Code
#m2.formula<- formula("dv ~  (pctrental + pctpublic_w_zsp)")
#results_m2_computational.lmr <-
#  estimatr::lm_robust(m2.formula,   data = vienna_authors.df, 
#                      clusters = tract_key)


# m1.formula  <- formula("d_rr_06 ~ dv_pop_01*pct_noneu_06")

results_m2_interaction.lmr <-
  estimatr::lm_robust(formula("dv ~ pctrental*pctpublic_w_zsp"),
                      data = vienna_authors.df,
                      clusters = tract_key)
Warning in eval(quote({: Some observations have missingness in the cluster
variable(s) but not in the outcome or covariates. These observations have been
dropped.
Code
results_m2_m1model.lmr <-
  estimatr::lm_robust(formula("dv ~ pctpublic_w_zsp*pctforeign"),
                      data = vienna_authors.df,
                      clusters = tract_key)

results_m2_rental.lmr <-
  estimatr::lm_robust(formula("dv ~ pctrental"),
                      data = vienna_authors.df,
                      clusters = tract_key)
Warning in eval(quote({: Some observations have missingness in the cluster
variable(s) but not in the outcome or covariates. These observations have been
dropped.
Code
results_m2_housing.lmr <-
  estimatr::lm_robust(formula("dv ~ pctpublic_w_zsp"),
                      data = vienna_authors.df,
                      clusters = tract_key)
Warning in eval(quote({: Some observations have missingness in the cluster
variable(s) but not in the outcome or covariates. These observations have been
dropped.
Code
ml.ls <- list(author_model2 = results_m2_computational.lmr,
              interaction = results_m2_interaction.lmr,
              m1spec = results_m2_m1model.lmr,
              housing_only = results_m2_housing.lmr,
              rental_only = results_m2_rental.lmr
              )

alt_var_m2.df <- tidy_results(ml.ls)
alt_var_m2_summary.df <- tidy_summary(ml.ls)

alt_var_m2.df  %>% group_by(repl_id) %>% gt()
term estimate std.error statistic p.value conf.low conf.high df outcome
author_model2
(Intercept) 0.04219616 0.010736620 3.930116 2.263033e-04 0.020710245 0.06368207 58.74038 dv
pctrental 0.03212893 0.014428340 2.226793 2.916168e-02 0.003355367 0.06090249 70.39755 dv
pctpublic_w_zsp 0.08703017 0.013300783 6.543237 8.609872e-09 0.060499496 0.11356085 69.53739 dv
interaction
(Intercept) 0.04093776 0.010419537 3.928942 2.265245e-04 0.020087882 0.06178763 58.94209 dv
pctrental 0.03898312 0.014020002 2.780536 6.921966e-03 0.011034410 0.06693184 71.94745 dv
pctpublic_w_zsp 0.10581306 0.012985175 8.148759 5.606402e-12 0.079952083 0.13167404 76.22568 dv
pctrental:pctpublic_w_zsp -0.13404200 0.017275356 -7.759145 7.373887e-12 -0.168314805 -0.09976919 100.23985 dv
m1spec
(Intercept) 0.08447235 0.005546008 15.231199 1.276299e-28 0.073478806 0.09546589 107.66008 dv
pctpublic_w_zsp 0.04470112 0.007389251 6.049479 1.352690e-07 0.029891626 0.05951061 54.81840 dv
pctforeign -0.10894320 0.028262648 -3.854671 2.601661e-04 -0.165343808 -0.05254260 67.77636 dv
pctpublic_w_zsp:pctforeign 0.08442037 0.052470547 1.608910 1.129095e-01 -0.020546127 0.18938687 59.72958 dv
housing_only
(Intercept) 0.06740029 0.002945504 22.882426 1.454410e-53 0.061585426 0.07321515 168.44151 dv
pctpublic_w_zsp 0.05838478 0.004264515 13.690838 7.123166e-25 0.049926537 0.06684302 102.40444 dv
rental_only
(Intercept) 0.10827086 0.003098036 34.948221 1.318514e-61 0.102131670 0.11441005 110.63627 dv
pctrental -0.04697894 0.004860515 -9.665424 9.095117e-17 -0.056600464 -0.03735742 122.44937 dv
Code
alt_var_m2_summary.df %>% group_by(repl_id) %>% gt()
r.squared adj.r.squared df.residual res_var nobs
author_model2
0.15088652 0.14993192 1779 0.001921443 1782
interaction
0.18204469 0.18066456 1778 0.001851977 1782
m1spec
0.17655648 0.17516709 1778 0.001864403 1782
housing_only
0.14284274 0.14236119 1780 0.001938556 1782
rental_only
0.09071395 0.09020311 1780 0.002056451 1782
Code
rm(
  ml.ls,
  results_m2_interaction.lmr,
  results_m2_m1model.lmr,
  results_m2_housing.lmr,
  results_m2_rental.lmr
)

As with the primary finding, none of the alternative models fits the data well (R² ranges from 0.09 to 0.18), and the model without an interaction term has only slightly lower explanatory power than the model with one (R² of 0.15 versus 0.18). Further, when the original specification used for the main, country-wide finding (as above) is fit to this Vienna subset, the interaction term is not significant (p = 0.11).
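
One way to make the comparison between specifications with and without an interaction concrete is to fit both to the same data and compare their fit directly. The sketch below does this on simulated data with base R's `lm()`. The data frame `sim`, its columns, and the coefficient values are illustrative assumptions; they are not the Vienna data or the authors' models, and the F-test shown is the classical one rather than a cluster-robust test.

```r
# Illustrative only: simulated data, not the Vienna tracts.
set.seed(1)
n <- 500
sim <- data.frame(x1 = runif(n), x2 = runif(n))
# Outcome with a small interaction and a lot of unexplained noise
sim$y <- 0.05 + 0.03 * sim$x1 + 0.09 * sim$x2 -
  0.05 * sim$x1 * sim$x2 + rnorm(n, sd = 0.05)

fit_additive    <- lm(y ~ x1 + x2, data = sim)
fit_interaction <- lm(y ~ x1 * x2, data = sim)

# R-squared of each specification and the gain from the interaction term
r2 <- c(additive    = summary(fit_additive)$r.squared,
        interaction = summary(fit_interaction)$r.squared)
r2_gain <- unname(r2["interaction"] - r2["additive"])

# F-test for the nested comparison (classical, not cluster-robust)
nested_test <- anova(fit_additive, fit_interaction)

print(round(r2, 3))
print(nested_test)
```

A small `r2_gain` alongside a low overall R² is the pattern described above: the interaction adds little, and most of the variation remains unexplained either way.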

Proportion of third-country nationals

In light of the visual inspection of the data below, we investigate the counterintuitive relationship between the proportion of third-country nationals living in a particular ward and the change in far-right vote share. In particular, if the mechanism driving the change in support for far-right parties is direct competition between third-country and Austrian nationals for public housing, then, assuming individuals prefer to continue living near their current residence, we would expect voters living in areas with more third-country nationals to be more supportive of far-right parties.

Code
results_m2_pctforeign <-
  estimatr::lm_robust(dv ~ pctforeign * pctpublic_w_zsp, data = vienna_authors.df,
                      clusters = tract_key)
tidy_results(list("pctforeign" = results_m2_pctforeign)) %>%
  gt()
repl_id term estimate std.error statistic p.value conf.low conf.high df outcome
pctforeign (Intercept) 0.08447235 0.005546008 15.231199 1.276299e-28 0.07347881 0.09546589 107.66008 dv
pctforeign pctforeign -0.10894320 0.028262648 -3.854671 2.601661e-04 -0.16534381 -0.05254260 67.77636 dv
pctforeign pctpublic_w_zsp 0.04470112 0.007389251 6.049479 1.352690e-07 0.02989163 0.05951061 54.81840 dv
pctforeign pctforeign:pctpublic_w_zsp 0.08442037 0.052470547 1.608910 1.129095e-01 -0.02054613 0.18938687 59.72958 dv
Code
tidy_summary(list("pctforeign" = results_m2_pctforeign)) %>%
  gt()
repl_id r.squared adj.r.squared df.residual res_var nobs
pctforeign 0.1765565 0.1751671 1778 0.001864403 1782
Code
results_m2_pctforeign_w_controls <-
  estimatr::lm_robust(dv ~ pctforeign * pctpublic_w_zsp + lab_pct_pensioners + educ_tertiary,
                      data = vienna_authors.df, clusters = tract_key)
tidy_results(list("pctforeign_controls" = results_m2_pctforeign_w_controls)) %>%
  gt()
repl_id term estimate std.error statistic p.value conf.low conf.high df outcome
pctforeign_controls (Intercept) 0.10041371 0.014092499 7.1253300 5.748448e-09 0.07205103 0.12877638 46.245200 dv
pctforeign_controls pctforeign -0.11667692 0.028902936 -4.0368537 1.380951e-04 -0.17433530 -0.05901855 69.094413 dv
pctforeign_controls pctpublic_w_zsp 0.04521248 0.007337279 6.1620230 8.009815e-08 0.03051726 0.05990770 56.542699 dv
pctforeign_controls lab_pct_pensioners -0.04279600 0.038845279 -1.1017041 2.754733e-01 -0.12067484 0.03508284 54.037982 dv
pctforeign_controls educ_tertiary -2.43190135 3.065725682 -0.7932547 4.867196e-01 -12.30890544 7.44510274 2.936475 dv
pctforeign_controls pctforeign:pctpublic_w_zsp 0.07929712 0.051862911 1.5289754 1.315638e-01 -0.02445960 0.18305383 59.568735 dv
Code
tidy_summary(list("pctforeign_controls" = results_m2_pctforeign_w_controls)) %>%
  gt()
repl_id r.squared adj.r.squared df.residual res_var nobs
pctforeign_controls 0.1852827 0.182989 1776 0.001846723 1782

It’s possible that these results are driven by homophily, i.e., voters residing in tracts with higher percentages of third-country nationals have political preferences that are more friendly to third-country nationals. To test this, we check whether voters living in areas with few third-country nationals were also more likely to vote for far-right parties in 2002. So that the regressors refer to the same period as the outcome, we back out the 2002 share of third-country nationals (pctforeign02) from pctforeign and pctforeign_delta.

Code
placebo_df <- vienna_authors.df %>%
  mutate(pctforeign02 = pctforeign / (1 + pctforeign_delta))
results_m2_pctforeign_placebo <-
  estimatr::lm_robust(farright_share2002 ~ pctforeign02 * pctpublic_w_zsp, data = placebo_df,
                      clusters = tract_key)
tidy_results(list("pctforeign_placebo" = results_m2_pctforeign_placebo)) %>%
  gt()
repl_id term estimate std.error statistic p.value conf.low conf.high df outcome
pctforeign_placebo (Intercept) 0.0746101488 0.001937988 38.49876040 2.582460e-64 0.070768098 0.07845220 106.47809 farright_share2002
pctforeign_placebo pctforeign02 0.0132415358 0.010541431 1.25614215 2.133133e-01 -0.007789136 0.03427221 68.79969 farright_share2002
pctforeign_placebo pctpublic_w_zsp 0.0208641718 0.003509215 5.94553772 2.068608e-07 0.013829025 0.02789932 54.13661 farright_share2002
pctforeign_placebo pctforeign02:pctpublic_w_zsp 0.0009347354 0.025369795 0.03684442 9.707320e-01 -0.049817589 0.05168706 59.70385 farright_share2002
Code
tidy_summary(list("pctforeign_placebo" = results_m2_pctforeign_placebo)) %>%
  gt()
repl_id r.squared adj.r.squared df.residual res_var nobs
pctforeign_placebo 0.07523995 0.07367962 1778 0.0005060312 1782
Code
results_m2_pctforeign_w_controls_placebo <-
  estimatr::lm_robust(farright_share2002 ~ pctforeign02 * pctpublic_w_zsp + lab_pct_pensioners + educ_tertiary,
                      data = placebo_df, clusters = tract_key)
tidy_results(list("pctforeign_controls_placebo" = results_m2_pctforeign_w_controls_placebo)) %>%
  gt()
repl_id term estimate std.error statistic p.value conf.low conf.high df outcome
pctforeign_controls_placebo (Intercept) 5.571588e-02 0.004487867 12.414780601 2.761652e-16 0.04668209 0.06474968 45.964636 farright_share2002
pctforeign_controls_placebo pctforeign02 3.043494e-02 0.010465076 2.908239028 4.868129e-03 0.00956258 0.05130731 69.918663 farright_share2002
pctforeign_controls_placebo pctpublic_w_zsp 1.894769e-02 0.003455488 5.483361021 1.043481e-06 0.01202502 0.02587036 55.819240 farright_share2002
pctforeign_controls_placebo lab_pct_pensioners 6.278750e-02 0.013749355 4.566578195 2.923445e-05 0.03521969 0.09035532 53.827738 farright_share2002
pctforeign_controls_placebo educ_tertiary -7.451271e-01 0.449382445 -1.658113458 1.977681e-01 -2.19230744 0.70205328 2.938633 farright_share2002
pctforeign_controls_placebo pctforeign02:pctpublic_w_zsp -7.472873e-05 0.024815850 -0.003011331 9.976074e-01 -0.04972218 0.04957273 59.512565 farright_share2002
Code
tidy_summary(list("pctforeign_controls_placebo" = results_m2_pctforeign_w_controls_placebo)) %>%
  gt()
repl_id r.squared adj.r.squared df.residual res_var nobs
pctforeign_controls_placebo 0.1003251 0.09779222 1776 0.000492859 1782
Code
rm(placebo_df)

We find the opposite—namely, that the percentage of third-country nationals is, if anything, weakly positively associated with support for far-right parties in previous elections, which casts doubt on the hypothesis that far-right voters are residentially segregated from third-country nationals.

This suggests that the evidence from Vienna specifically for the proposed mechanism—namely, that direct competition with third-country nationals for public housing resources pushes voters to support far-right parties—is weak.
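
The placebo test depends on recovering the 2002 share of third-country nationals from the 2006 share and its change. The expression `pctforeign / (1 + pctforeign_delta)` recovers the earlier share only if `pctforeign_delta` is a proportional change, that is, (share06 - share02) / share02. That reading is an assumption here, since the variable's definition is not shown in this section. A minimal check of the algebra on made-up numbers (all names below are hypothetical):

```r
# Hypothetical shares, for illustration only
share02 <- c(0.05, 0.10, 0.20)
share06 <- c(0.06, 0.09, 0.25)

# ASSUMPTION: delta is a proportional (relative) change
delta_rel <- (share06 - share02) / share02
recovered_rel <- share06 / (1 + delta_rel)

# If delta were instead an absolute (percentage-point) change,
# the 2002 share would be recovered by subtraction
delta_abs <- share06 - share02
recovered_abs <- share06 - delta_abs

# Applying the ratio formula to an absolute change does not recover share02
wrong <- share06 / (1 + delta_abs)

print(cbind(share02, recovered_rel, recovered_abs, wrong))
```

If the change variable were in percentage points rather than relative terms, the ratio formula would return values close to the 2006 share instead of the 2002 share, so the definition is worth confirming against the codebook.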

Reweighting

If we had individual rather than aggregate data, we would most likely fit a model at the voter level, regressing whether or not an individual cast a vote for a far-right party in the 2006 federal election against whether they lived in public housing, the number of third-country nationals living in their neighborhood, etc. This paper uses aggregate data at the tract and municipality level as an approximation to the individual data. However, unless we weight the regression by the number of voters in a municipality, we are implicitly weighting voters differently in different municipalities or tracts. As a result, we rerun both models reweighting the observations by the number of voters in the administrative unit.

Code
results_m1_reweight <- estimatr::lm_robust(
  d_rr_06 ~ dv_pop_01 * pct_noneu_06, data = austria_authors.df,
  weights = registered_06)

repro_m1_reweight.df <- tidy_results(list(comp = results_m1_reweight)) %>%
   bind_rows(tidy_m1_original.df)

repro_m1_reweight_summary.df <- tidy_summary(list(comp = results_m1_reweight))

repro_m1_reweight.df %>% group_by(repl_id) %>% gt()
term estimate std.error statistic p.value conf.low conf.high df outcome
comp
(Intercept) 0.04348474 0.001174237 37.032332 3.047451e-237 0.04118210 0.04578738 2369 d_rr_06
dv_pop_01 0.04823643 0.012962678 3.721178 2.029103e-04 0.02281706 0.07365580 2369 d_rr_06
pct_noneu_06 -0.19640480 0.033927358 -5.788980 8.019187e-09 -0.26293519 -0.12987441 2369 d_rr_06
dv_pop_01:pct_noneu_06 0.85815267 0.159589715 5.377243 8.304824e-08 0.54520269 1.17110265 2369 d_rr_06
original
(Intercept) 0.04000000 0.000000000 NA 5.000000e-02 NA NA NA NA
dv_pop_01 0.02000000 0.030000000 NA 1.000000e+00 NA NA NA NA
pct_noneu_06 -0.02000000 0.030000000 NA 5.000000e-02 NA NA NA NA
dv_pop_01:pct_noneu_06 0.67000000 0.170000000 NA 5.000000e-02 NA NA NA NA
Code
repro_m1_reweight_summary.df %>% group_by(repl_id) %>% gt()
r.squared adj.r.squared df.residual res_var nobs
comp
0.4005089 0.3997497 2369 1.285321 2373

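We see that the estimates become more pronounced after reweighting: the coefficient on the share of third-country residents moves from -0.02 to -0.20, the coefficient on the share of residents in public housing rises from 0.02 to 0.05, and the interaction term rises from 0.67 to 0.86. That is, municipalities with more third-country residents show substantially smaller gains for far-right parties, while municipalities with high proportions of people living in public housing show larger gains.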

We can repeat the same analysis at the level of Vienna tracts for the second analysis.

Code
results_m2_reweight <- 
  estimatr::lm_robust(dv ~ pctrental + pctpublic_w_zsp, data = vienna_authors.df,
                      weights = exp(log_voters), clusters = tract_key)
Warning in eval(quote({: Some observations have missingness in the cluster
variable(s) but not in the outcome or covariates. These observations have been
dropped.
Code
repro_m2_reweight.df <- tidy_results(list(comp = results_m2_reweight)) %>%
  bind_rows(tidy_m2_original.df)

repro_m2_reweight_summary.df <- tidy_summary(list(comp = results_m2_reweight))

repro_m2_reweight.df %>% group_by(repl_id) %>% gt()
term estimate std.error statistic p.value conf.low conf.high df outcome
comp
(Intercept) 0.04417207 0.01074126 4.112374 1.397464e-04 2.261820e-02 0.06572594 52.00400 dv
pctrental 0.02914422 0.01455060 2.002957 4.950859e-02 6.459048e-05 0.05822386 62.71530 dv
pctpublic_w_zsp 0.08507290 0.01357259 6.267993 4.104067e-08 5.793362e-02 0.11221217 61.08855 dv
original
(Intercept) 0.04000000 0.01000000 NA 5.000000e-02 NA NA NA NA
pctrental 0.03000000 0.01000000 NA 5.000000e-02 NA NA NA NA
pctpublic_w_zsp 0.09000000 0.01000000 NA 5.000000e-02 NA NA NA NA
Code
repro_m2_reweight_summary.df %>% group_by(repl_id) %>% gt()
r.squared adj.r.squared df.residual res_var nobs
comp
0.1566887 0.1557406 1779 1.161713 1782

In this case, the weighting makes almost no difference to the model results.

We can also perform our alternative specification, where we adjust for the percentage of third-country residents in the Vienna analysis, with reweighting.

Code
results_m2_pctforeign_reweight <-
  estimatr::lm_robust(dv ~ pctforeign * pctpublic_w_zsp, data = vienna_authors.df,
                      weights = exp(log_voters), clusters = tract_key)

tidy_results(list(results_m2_pctforeign_reweight)) %>% gt()
repl_id term estimate std.error statistic p.value conf.low conf.high df outcome
1 (Intercept) 0.08388627 0.005397493 15.541710 3.897863e-29 0.07318567 0.09458687 106.39146 dv
1 pctforeign -0.10940114 0.027736766 -3.944264 1.880313e-04 -0.16472241 -0.05407987 69.85384 dv
1 pctpublic_w_zsp 0.04623732 0.008018155 5.766579 2.942969e-07 0.03020112 0.06227352 60.45808 dv
1 pctforeign:pctpublic_w_zsp 0.07940299 0.052729815 1.505846 1.368968e-01 -0.02588327 0.18468926 65.73932 dv
Code
tidy_summary(list(results_m2_pctforeign_reweight)) %>% gt()
repl_id r.squared adj.r.squared df.residual res_var nobs
1 0.1846439 0.1832682 1778 1.123835 1782

Again, this doesn’t seem to meaningfully affect the results.
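
To illustrate why weighting by electorate size can matter in principle, the sketch below simulates voters nested in units and compares three fits: an individual-level regression, an aggregate regression weighted by the number of voters, and an unweighted aggregate regression. When the covariate is constant within a unit, the weighted aggregate fit reproduces the individual-level coefficients exactly. All data and names are simulated and hypothetical; in the Vienna code above, `exp(log_voters)` would play the role of `voters`, assuming `log_voters` is the natural log of the number of voters in a tract.

```r
set.seed(2)
n_units <- 50
voters  <- sample(50:5000, n_units, replace = TRUE)  # electorate size per unit
x_unit  <- runif(n_units)                            # unit-level covariate

# Individual-level data: each voter inherits the covariate of their unit
indiv <- data.frame(unit = rep(seq_len(n_units), times = voters))
indiv$x <- x_unit[indiv$unit]
indiv$vote <- rbinom(nrow(indiv), 1, prob = 0.1 + 0.2 * indiv$x)

# Aggregate to unit-level vote shares
agg <- data.frame(
  x      = x_unit,
  share  = as.numeric(tapply(indiv$vote, indiv$unit, mean)),
  voters = voters
)

coef_indiv      <- coef(lm(vote ~ x, data = indiv))
coef_weighted   <- coef(lm(share ~ x, data = agg, weights = voters))
coef_unweighted <- coef(lm(share ~ x, data = agg))

print(rbind(coef_indiv, coef_weighted, coef_unweighted))
```

The unweighted fit treats a unit of 50 voters the same as a unit of 5,000, so it answers a question about the average unit rather than the average voter. Whether the two differ much is an empirical matter, as the Vienna results show.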

Data Robustness Analysis

Graphical Exploration of Data (intraocular impact)

To better understand the authors' main claims, we attempt to visualize the trends they find in the raw data. We begin with the first analysis, which links support for far-right political parties in the 2006 federal election with the proportion of third-country nationals across Austrian municipalities.

Code
requireNamespace("plotly")
Loading required namespace: plotly
Code
## AUSTRIA
# Plot the relationship between % non-EU and change in vote share
suppressWarnings({
  austria_authors.df %>%
    ggplot(aes(x = pct_noneu_06, y = d_rr_06)) +
    geom_point() +
    labs(x = "% non-EU residents in municipality",
         y = "Change in far-right vote share") +
    # NOTE: There are ~200 municipalities with 0 non-EU residents
    scale_x_log10() +
    geom_smooth(method = lm)
} %>%
  plotly::ggplotly())
`geom_smooth()` using formula = 'y ~ x'
Code
# Plot the relationship between % non-EU and vote share
suppressWarnings({
  austria_authors.df %>%
    ggplot(aes(x = pct_noneu_06, y = rr_share_06)) +
    geom_point() +
    labs(x = "% non-EU residents in municipality",
         y = "Far-right vote share") +
    # NOTE: There are ~200 municipalities with 0 non-EU residents
    scale_x_log10() +
    geom_smooth(method = lm)
} %>%
  plotly::ggplotly())
`geom_smooth()` using formula = 'y ~ x'
Code
# Compare vote share in 2002 and 2006
suppressWarnings({
  austria_authors.df %>%
    select(pct_noneu_06, rr_share_06, rr_share_02) %>%
    pivot_longer(
      cols = starts_with("rr_share"),
      names_prefix = "rr_share_",
      names_to = "year",
      values_to = "vote_share"
    ) %>%
    ggplot(aes(x = pct_noneu_06, y = vote_share)) +
    geom_point() +
    labs(x = "% non-EU residents in municipality",
         y = "Far-right vote share") +
    # NOTE: There are ~200 municipalities with 0 non-EU residents
    scale_x_log10() +
    geom_smooth(method = lm) +
    facet_wrap(vars(year))
} %>%
  plotly::ggplotly())
`geom_smooth()` using formula = 'y ~ x'

Visual examination of the data indicates that (1) there is a very modest association between the percentage of third-country nationals and both the level of support and the change in the level of support for far-right parties in the 2006 elections, and (2) this association is more pronounced in 2006 than in 2002. One of the article's primary interests, in addition, is the interaction between these two factors: the claim that support is driven by competition between Austrian and third-country nationals for housing, which is most acute when public housing rates and the proportion of the population made up of third-country nationals are both high. To visualize this, we plot the relationship between the percentage of third-country nationals and support for far-right parties, stratifying by the percentage of the population that lives in public housing.

Code
# Bin the percentage of the municipality in public housing and plot the
# change in vote share by % non-EU residents
cuts <-
  with(austria_authors.df,
       quantile(dv_pop_01, probs = seq(0, 1, 1 / 3),
                na.rm = TRUE))
suppressWarnings({
  austria_authors.df %>%
    mutate(pct_public_housing = cut(dv_pop_01, breaks = cuts, include.lowest = TRUE)) %>%
    # ~ 13 municipalities don't have data on public housing
    drop_na(pct_public_housing) %>%
    ggplot(aes(x = pct_noneu_06, y = d_rr_06)) +
    geom_point() +
    labs(x = "% non-EU residents in municipality",
         y = "Change in far-right vote share") +
    # NOTE: There are ~200 municipalities with 0 non-EU residents
    scale_x_log10() +
    geom_smooth(method = lm) +
    facet_wrap(vars(pct_public_housing))
} %>%
  plotly::ggplotly())
`geom_smooth()` using formula = 'y ~ x'

We see that while the trend line does appear to get steeper, it is estimated with a considerable amount of uncertainty; in particular, in all three cases, the trend is not visually apparent, and the slope of the fitted linear model has a high degree of uncertainty about sign.
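
The stratification above relies on `quantile()` to find tertile break points and `cut()` to assign each observation to a bin. The sketch below shows the mechanics on a simulated, skewed variable (the vector `x` is hypothetical), including why `include.lowest = TRUE` is needed: without it, observations equal to the minimum fall outside the first interval and become `NA`.

```r
set.seed(3)
x <- c(0, 0, rexp(98, rate = 10))   # skewed share-like variable with some zeros

# Tertile break points: 0%, 33.3%, 66.7%, 100% quantiles
breaks <- quantile(x, probs = seq(0, 1, 1 / 3), na.rm = TRUE)

# include.lowest = TRUE keeps the minimum value in the first bin
bins <- cut(x, breaks = breaks, include.lowest = TRUE)

# Default behavior: intervals are open on the left, so the minimum is dropped
bins_open <- cut(x, breaks = breaks)

table(bins, useNA = "ifany")
table(bins_open, useNA = "ifany")
```

One caveat with quantile-based bins: if a large share of observations take the same value (for example, many units at exactly zero), two break points can coincide and `cut()` stops with an error that the breaks are not unique.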

We additionally color the observations based on the baseline level of support for far-right parties to see if an interaction with public housing rates and proportion of third-country nationals is apparent. In particular, it seems plausible that these relationships might be strengthened in places where the baseline support for far-right parties (as measured by their support in 2002) is already high.

Code
# Add coloring based on how far-right the municipality was in the previous
# election
suppressWarnings({
  austria_authors.df %>%
    mutate(pct_public_housing = cut(dv_pop_01, breaks = cuts, include.lowest = TRUE)) %>%
    # ~ 13 municipalities don't have data on public housing
    drop_na(pct_public_housing) %>%
    ggplot(aes(x = pct_noneu_06, y = d_rr_06, color = rr_share_02)) +
    geom_point() +
    labs(x = "% non-EU residents in municipality",
         y = "Change in far-right vote share",
         color = "% far-right in\nprevious election") +
    # NOTE: There are ~200 municipalities with 0 non-EU residents
    scale_x_log10() +
    scale_color_gradient(low = "blue", high = "red") +
    geom_smooth(method = lm) +
    facet_wrap(vars(pct_public_housing))
} %>%
  plotly::ggplotly())
`geom_smooth()` using formula = 'y ~ x'

While some extreme outliers seem to be in places that were already far-right in 2002, no trend is immediately apparent.
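
The plots above use a log10 x-axis, and the code comments note roughly 200 municipalities with zero non-EU residents. Because log10(0) is -Inf, such observations cannot contribute to a straight line fitted on the log scale. The sketch below uses simulated data (all names hypothetical) to show how one might count these observations and compare the number of points behind a log-scale fit with the number behind a raw-scale fit.

```r
set.seed(4)
share <- c(rep(0, 20), runif(180, 0.001, 0.3))   # 20 units with a zero share
y <- 0.04 - 0.05 * share + rnorm(200, sd = 0.02)

n_zero <- sum(share == 0)

log_share <- log10(share)        # zeros become -Inf
usable <- is.finite(log_share)   # observations that survive the log transform

fit_log <- lm(y[usable] ~ log_share[usable])
fit_raw <- lm(y ~ share)

n_used_log <- nobs(fit_log)
n_used_raw <- nobs(fit_raw)

print(c(zeros = n_zero, used_log = n_used_log, used_raw = n_used_raw))
```

If the zero-share units differ systematically from the rest, a trend line drawn on the log scale describes only the units with at least some non-EU residents.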

Next, we consider the second analysis, which tries to more directly test whether direct competition for public housing between Austrian and third-country nationals explains increasing support for far-right parties in Vienna.

We begin by visualizing the relationship between the proportion of third-country nationals living in a tract, the proportion of residents living in public housing in a tract, and change in far-right vote share.

Code
## VIENNA
# Plot the relationship between % non-EU and change in vote share
suppressWarnings({
  vienna_authors.df %>%
    ggplot(aes(x = pctforeign, y = dv)) +
    geom_point() +
    labs(x = "% non-EU residents in tract",
         y = "Change in far-right vote share") +
    # NOTE: There are ~15 tracts with no foreign residents
    scale_x_log10() +
    geom_smooth(method = lm)
} %>%
  plotly::ggplotly() 
)
`geom_smooth()` using formula = 'y ~ x'
Code
# Plot the relationship between % in public housing and change in vote share
suppressWarnings({
  vienna_authors.df %>%
    ggplot(aes(x = pctpublic_w_zsp, y = dv)) +
    geom_point() +
    labs(x = "% of residents in public housing",
         y = "Change in far-right vote share") +
    # NOTE: There are ~900 tracts with no one in public housing
    scale_x_log10() +
    geom_smooth(method = lm)
  } %>%
  plotly::ggplotly())
`geom_smooth()` using formula = 'y ~ x'

We notice two important facts. First, the relationship between the percentage of individuals living in public housing and the change in far-right vote share is much more pronounced and positive in Vienna than in the national data. Second, the relationship between the percentage of third-country nationals and the far-right vote share is actually very pronounced and negative. To understand the interrelationship between these two factors, we stratify by the proportion of individuals living in public housing and plot the relationship between the percentage of third-country nationals and change in far-right vote share, and see that while the rate of support does seem to rise in the highest bin (i.e., those tracts where the largest number of people live in public housing), the trend within each bin remains fairly negative.

Code
# Plot the relationship between % non-EU and change in vote share, stratified by
# the rate of public housing
cuts <-
  with(vienna_authors.df,
       quantile(
         pctpublic_w_zsp,
         probs = c(0, 1 / 2, 3 / 4, 1),
         na.rm = TRUE
       ))

suppressWarnings({
  vienna_authors.df %>%
    mutate(public_housing = cut(pctpublic_w_zsp, cuts, include.lowest = TRUE)) %>%
    ggplot(aes(x = pctforeign, y = dv)) +
    geom_point() +
    labs(x = "% non-EU residents in tract",
         y = "Change in far-right vote share") +
    # NOTE: There are ~15 tracts with no foreign residents
    scale_x_log10() +
    geom_smooth(method = lm) +
    facet_wrap(vars(public_housing))
} %>%
    plotly::ggplotly())
`geom_smooth()` using formula = 'y ~ x'
Code
rm(cuts)

Outlier analysis

First, we look at outliers in the main variables from Table 1 (Austrian sample). The following code visualizes the relationships between the main variables of interest: the percentage of non-EU residents, the percentage of people living in public housing, and the change in far-right vote share. For each variable, we flag observations above the 99th or below the 1st percentile as outliers. Each plot then compares the linear fit on the full sample (red) with the fit after the flagged observations are removed (blue).

Code
library(patchwork)

###### OUTLIERS PLOTS

p1 <- austria_authors.df %>%
  mutate(outlier = ifelse(
    (
      dv_pop_01 > quantile(dv_pop_01, 0.99, na.rm = T) |
        d_rr_06 > quantile(d_rr_06, 0.99, na.rm = T) |
        dv_pop_01 < quantile(dv_pop_01, 0.01, na.rm = T) |
        d_rr_06 < quantile(d_rr_06, 0.01, na.rm = T)
    ),
    T,
    F
  )) %>%
  ggplot(aes(x = dv_pop_01, y = d_rr_06)) +
  geom_point(aes(color = outlier), size = 1) +
  scale_color_manual(values = c('navy', 'red')) +
  ggtitle('') +
  ylab('Δ  2002–6') + xlab('% public housing') +
  geom_smooth(method = "lm",
              se = T,
              color = "red") +
  geom_smooth(
    data = . %>% filter(outlier == F),
    method = "lm",
    se = T,
    color = "blue"
  )  +
  geom_text(
    data = . %>%
      summarise(r2 = summary(lm(d_rr_06 ~ dv_pop_01))$r.squared),
    aes(
      label = paste("R^2 =", round(r2, 3)),
      x = 0.4,
      y = 0.2
    ),
    color = "red",
    hjust = 0
  ) +
  geom_text(
    data = . %>% filter(outlier == 0) %>%
      summarise(r2 = summary(lm(d_rr_06 ~ dv_pop_01))$r.squared),
    aes(
      label = paste("R^2 =", round(r2, 3)),
      x = 0.4,
      y = 0.18
    ),
    color = "blue",
    hjust = 0
  )

p2 <- austria_authors.df %>%
  mutate(outlier = ifelse(
    (
      pct_noneu_06 > quantile(pct_noneu_06, 0.99, na.rm = T) |
        d_rr_06 > quantile(d_rr_06, 0.99, na.rm = T) |
        pct_noneu_06 < quantile(pct_noneu_06, 0.01, na.rm = T) |
        d_rr_06 < quantile(d_rr_06, 0.01, na.rm = T)
    ),
    T,
    F
  )) %>%
  ggplot(aes(x = pct_noneu_06, y = d_rr_06)) +
  geom_point(aes(color = outlier), size = 1) +
  scale_color_manual(values = c('navy', 'red')) +
  ggtitle('') +
  ylab('Δ  2002–6') + xlab('% non-EU') +
  geom_smooth(method = "lm",
              se = T,
              color = "red") +
  geom_smooth(
    data = . %>% filter(outlier == F),
    method = "lm",
    se = T,
    color = "blue"
  )  +
  geom_text(
    data = . %>%
      summarise(r2 = summary(lm(
        d_rr_06 ~ pct_noneu_06
      ))$r.squared),
    aes(
      label = paste("R^2 =", round(r2, 3)),
      x = 0.15,
      y = 0.2
    ),
    color = "red",
    hjust = 0
  ) +
  geom_text(
    data = . %>% filter(outlier == 0) %>%
      summarise(r2 = summary(lm(
        d_rr_06 ~ pct_noneu_06
      ))$r.squared),
    aes(
      label = paste("R^2 =", round(r2, 3)),
      x = 0.15,
      y = 0.18
    ),
    color = "blue",
    hjust = 0
  )

p3 <- austria_authors.df %>%
  mutate(outlier = ifelse(
    (
      dv_pop_01 > quantile(dv_pop_01, 0.99, na.rm = TRUE) |
        d_rr_06 > quantile(d_rr_06, 0.99, na.rm = TRUE) |
        dv_pop_01 < quantile(dv_pop_01, 0.01, na.rm = TRUE) |
        d_rr_06 < quantile(d_rr_06, 0.01, na.rm = TRUE) |
        pct_noneu_06 > quantile(pct_noneu_06, 0.99, na.rm = TRUE) |
        d_rr_06 > quantile(d_rr_06, 0.99, na.rm = TRUE) |
        pct_noneu_06 < quantile(pct_noneu_06, 0.01, na.rm = TRUE) |
        d_rr_06 < quantile(d_rr_06, 0.01, na.rm = TRUE)
    ),
    TRUE,
    FALSE
  )) %>%
  ggplot(aes(x = I(pct_noneu_06 * pct_noneu_06), y = d_rr_06)) +
  geom_point(aes(color = outlier), size = 1) +
  scale_color_manual(values = c('navy', 'red')) +
  ggtitle('') +  # Add your plot title
  geom_smooth(method = "lm",
              se = T,
              color = "red") + # Add smoothed line for the whole sample
  geom_smooth(
    data = . %>% filter(outlier == F),
    method = "lm",
    se = T,
    color = "blue"
  )  +
  geom_text(
    data = . %>%
      summarise(r2 = summary(lm(
        d_rr_06 ~ I(pct_noneu_06 * pct_noneu_06)
      ))$r.squared),
    aes(
      label = paste("R^2 =", round(r2, 3)),
      x = 0.04,
      y = 0.2
    ),
    color = "red",
    hjust = 0
  ) +
  geom_text(
    data = . %>% filter(outlier == 0) %>%
      summarise(r2 = summary(lm(
        d_rr_06 ~ I(pct_noneu_06 * pct_noneu_06)
      ))$r.squared),
    aes(
      label = paste("R^2 =", round(r2, 3)),
      x = 0.04,
      y = 0.18
    ),
    color = "blue",
    hjust = 0
  )

suppressWarnings(print({p1+p2+p3}))
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'

Code
rm(p1,p2,p3)

From these plots, we can observe that the R-squared values decrease substantially after removing outliers: from 0.028 to 0.019 (a 32% decrease), from 0.013 to 0.010 (a 23% decrease), and from 0.008 to 0.003 for the key interaction (a 62.5% decrease).
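
The trimming rule used for these plots (flag an observation if any variable of interest lies below its 1st or above its 99th percentile) can be written compactly as a helper. This is a sketch of the same rule, not the code used to produce the figures; the function name `outside_quantiles` and the data frame `toy` are hypothetical.

```r
# Flag values outside the [lower, upper] quantile range of a numeric vector
outside_quantiles <- function(x, lower = 0.01, upper = 0.99) {
  q <- quantile(x, probs = c(lower, upper), na.rm = TRUE)
  x < q[[1]] | x > q[[2]]
}

set.seed(5)
toy <- data.frame(a = rnorm(1000), b = rnorm(1000), y = rnorm(1000))

# An observation is an outlier if any of the chosen variables is extreme
toy$outlier <- Reduce(`|`, lapply(toy[c("a", "b", "y")], outside_quantiles))

# Share of observations flagged; at most 2% per variable, so at most 6% here
share_flagged <- mean(toy$outlier)
print(share_flagged)
```

Because the flag is a union across variables, trimming 1% from each tail of three variables removes more than 2% of the sample in total, which is consistent with the drop in sample size reported in the regression tables.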

Now, we replicate the baseline model from Table 1 (a regression of the change in far-right vote share on public housing, non-EU residents, and their interaction).

Code
library("lfe")
Loading required package: Matrix

Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':

    expand, pack, unpack
Code
library("gtsummary")

austria_authors.df_upd <- austria_authors.df %>%
  mutate(
    outlier98 = ifelse(
      (dv_pop_01 > quantile(dv_pop_01, 0.99, na.rm = TRUE) | 
         d_rr_06 > quantile(d_rr_06, 0.99, na.rm = TRUE) |
         dv_pop_01 < quantile(dv_pop_01, 0.01, na.rm = TRUE) | 
         d_rr_06 < quantile(d_rr_06, 0.01, na.rm = TRUE) |
         pct_noneu_06 > quantile(pct_noneu_06, 0.99, na.rm = TRUE) | 
         d_rr_06 > quantile(d_rr_06, 0.99, na.rm = TRUE) |
         pct_noneu_06 < quantile(pct_noneu_06, 0.01, na.rm = TRUE) | 
         d_rr_06 < quantile(d_rr_06, 0.01, na.rm = TRUE)),
      TRUE, FALSE),
      outlier98_up = ifelse(
        (dv_pop_01 > quantile(dv_pop_01, 0.99, na.rm = TRUE) | 
           pct_noneu_06 > quantile(pct_noneu_06, 0.99, na.rm = TRUE) | 
           d_rr_06 > quantile(d_rr_06, 0.99, na.rm = TRUE)), 
        TRUE, FALSE),
        outlier98_down = ifelse(
      (dv_pop_01 < quantile(dv_pop_01, 0.01, na.rm = TRUE) | 
         pct_noneu_06 < quantile(pct_noneu_06, 0.01, na.rm = TRUE)|
         d_rr_06 < quantile(d_rr_06, 0.01, na.rm = TRUE)), 
      TRUE, FALSE)
    )



ols1 <- felm(d_rr_06 ~ dv_pop_01*pct_noneu_06|0|0, data =austria_authors.df) %>%
  tbl_regression(tidy_fun = purrr::partial(tidy_robust, robust = "HC1"))%>%        
  add_significance_stars(hide_p = F, hide_se = F)%>%
  gtsummary::add_glance_table(include = c(nobs, r.squared))
Arguments `vcov` and `vcov_args` have not been specified in `tidy_robust()`.
Specify at least one to obtain robust standard errors.
tidy_robust(): Robust estimation with
`parameters::model_parameters(model = x, ci = 0.95, robust = "HC1")`
Code
ols2 <- felm(d_rr_06 ~ dv_pop_01*pct_noneu_06|0|0, data =subset(austria_authors.df_upd, outlier98 == FALSE)) %>%
  tbl_regression()%>%        
  add_significance_stars(hide_p = F, hide_se = F)%>%
  add_glance_table(include = c(nobs, r.squared))       
ols3 <- felm(d_rr_06 ~ dv_pop_01*pct_noneu_06|0|0, data =subset(austria_authors.df_upd, outlier98_up == FALSE)) %>%
  tbl_regression()%>%        
  add_significance_stars(hide_p = F, hide_se = F)%>%
  add_glance_table(include = c(nobs, r.squared))
ols4 <- felm(d_rr_06 ~ dv_pop_01*pct_noneu_06|0|0, data =subset(austria_authors.df_upd, outlier98_down == FALSE)) %>%
  tbl_regression()%>%        
  add_significance_stars(hide_p = F, hide_se = F)%>%
  add_glance_table(include = c(nobs, r.squared))


tbl_merge_ex1 <-
  tbl_merge(
    tbls = list(ols1, ols2, ols3, ols4),
    tab_spanner = c("**Original Specification**", "1% Min-Max Dropped", "1% Max Dropped", "1% Min Dropped")
  )

tbl_merge_ex1
Characteristic Original Specification 1% Min-Max Dropped 1% Max Dropped 1% Min Dropped
Beta1 SE2 p-value Beta1 SE2 p-value Beta1 SE2 p-value Beta1 SE2 p-value
dv_pop_01 0.02* 0.012 0.045 0.02 0.014 0.090 0.03 0.015 0.070 0.02 0.012 0.069
pct_noneu_06 -0.02 0.030 0.5 0.02 0.032 0.6 0.02 0.033 0.6 -0.02 0.029 0.4
dv_pop_01 * pct_noneu_06 0.67*** 0.183 <0.001 0.47 0.280 0.091 0.50 0.293 0.091 0.67*** 0.176 <0.001
No. Obs. 2,373 2,284 2,308 2,349
R² 0.036 0.018 0.019 0.036

1 *p<0.05; **p<0.01; ***p<0.001
2 SE = Standard Error

The baseline model is reproducible in terms of effect sizes and standard errors, but the results are sensitive to the subsample. Dropping the top and bottom 1% of values of the three key variables reduces the interaction coefficient from 0.67 to 0.47 and raises its p-value from <0.001 to 0.091. Dropping only the top 1% gives a similar result (0.50, p = 0.091), whereas dropping only the bottom 1% leaves the estimate unchanged (0.67, p < 0.001).
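
The kind of comparison reported in this table can be sketched with base R by refitting the same interaction model on the full sample and on a trimmed sample. The example uses simulated data, classical rather than robust standard errors, and hypothetical names (`d`, `top`, `trim_top`), so it illustrates the procedure only and says nothing about the Austrian estimates.

```r
set.seed(6)
n <- 2000
d <- data.frame(x1 = rexp(n, 20), x2 = rexp(n, 30))
d$y <- 0.04 + 0.02 * d$x1 - 0.02 * d$x2 +
  0.7 * d$x1 * d$x2 + rnorm(n, sd = 0.03)

# Flag observations above the 99th percentile of any variable
top <- function(v) v > quantile(v, 0.99, na.rm = TRUE)
d$trim_top <- top(d$x1) | top(d$x2) | top(d$y)

fit_full <- lm(y ~ x1 * x2, data = d)
fit_trim <- lm(y ~ x1 * x2, data = d, subset = !trim_top)

# Side-by-side estimate and classical standard error of the interaction
compare <- rbind(
  full    = summary(fit_full)$coefficients["x1:x2", c("Estimate", "Std. Error")],
  trimmed = summary(fit_trim)$coefficients["x1:x2", c("Estimate", "Std. Error")]
)
print(compare)
print(c(n_full = nobs(fit_full), n_trimmed = nobs(fit_trim)))
```

With skewed regressors, the product term gets most of its variation from a few extreme observations, so trimming them tends to leave the interaction much less precisely estimated.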

Next, we cluster standard errors by district (bezirk). For the full sample, the result does not change drastically: the p-value for the interaction term increases to 0.006. When the highest values are dropped, however, the p-value for the interaction term rises further, to 0.2.

Code
rols1 <- felm(d_rr_06 ~ dv_pop_01*pct_noneu_06|0|0|bezirk, data =austria_authors.df) %>%
  tbl_regression(tidy_fun = purrr::partial(tidy_robust, robust = "HC1"))%>%        
  add_significance_stars(hide_p = F, hide_se = F)%>%
  add_glance_table(include = c(nobs, r.squared))
Arguments `vcov` and `vcov_args` have not been specified in `tidy_robust()`.
Specify at least one to obtain robust standard errors.
tidy_robust(): Robust estimation with
`parameters::model_parameters(model = x, ci = 0.95, robust = "HC1")`
Code
# Same specification on the trimmed subsamples
rols2 <- felm(d_rr_06 ~ dv_pop_01 * pct_noneu_06 | 0 | 0 | bezirk,
              data = subset(austria_authors.df_upd, outlier98 == FALSE)) %>%
  tbl_regression() %>%
  add_significance_stars(hide_p = FALSE, hide_se = FALSE) %>%
  add_glance_table(include = c(nobs, r.squared))
rols3 <- felm(d_rr_06 ~ dv_pop_01 * pct_noneu_06 | 0 | 0 | bezirk,
              data = subset(austria_authors.df_upd, outlier98_up == FALSE)) %>%
  tbl_regression() %>%
  add_significance_stars(hide_p = FALSE, hide_se = FALSE) %>%
  add_glance_table(include = c(nobs, r.squared))
rols4 <- felm(d_rr_06 ~ dv_pop_01 * pct_noneu_06 | 0 | 0 | bezirk,
              data = subset(austria_authors.df_upd, outlier98_down == FALSE)) %>%
  tbl_regression() %>%
  add_significance_stars(hide_p = FALSE, hide_se = FALSE) %>%
  add_glance_table(include = c(nobs, r.squared))


tbl_merge_ex1 <-
  tbl_merge(
    tbls = list(rols1, rols2, rols3, rols4),
    tab_spanner = c("**Original Specification**", "1% Min-Max Dropped", "1% Max Dropped", "1% Min Dropped")
  )

tbl_merge_ex1
Characteristic Original Specification 1% Min-Max Dropped 1% Max Dropped 1% Min Dropped
Beta1 SE2 p-value Beta1 SE2 p-value Beta1 SE2 p-value Beta1 SE2 p-value
dv_pop_01 0.02 0.024 0.3 0.02 0.028 0.4 0.03 0.029 0.4 0.02 0.023 0.4
pct_noneu_06 -0.02 0.060 0.7 0.02 0.064 0.8 0.02 0.065 0.8 -0.02 0.059 0.7
dv_pop_01 * pct_noneu_06 0.67** 0.242 0.006 0.47 0.373 0.2 0.50 0.374 0.2 0.67** 0.244 0.006
No. Obs. 2,373 2,284 2,308 2,349
R² 0.036 0.018 0.019 0.036

1 *p<0.05; **p<0.01; ***p<0.001
2 SE = Standard Error

Here we examine the covariate balance between outlier and non-outlier observations, using the upper-tail outlier flag (outlier98_up) as the grouping variable.

Code
library("cobalt")
 cobalt (Version 4.5.1, Build Date: 2023-04-27)
Code
library("MatchIt")

Attaching package: 'MatchIt'
The following object is masked from 'package:cobalt':

    lalonde
Code
# Keep observations with a defined outlier flag and complete covariates
austria_authors.df_upd1 <- austria_authors.df_upd %>%
  filter(!is.na(outlier98_up)) %>%
  filter(!is.na(educ_tertiary) & !is.na(avg_income) &
           !is.na(lab_pct_manufact_01) & !is.na(lab_pct_unemp) &
           !is.na(welfare_cap_06) & !is.na(health_cap_06) &
           !is.na(education_cap_06) & !is.na(foreignborn_delta))

set.seed(7)
m.out <- MatchIt::matchit(outlier98_up ~ educ_tertiary + registered_06+
                            avg_income + 
                            lab_pct_manufact_01 + 
                            lab_pct_unemp + welfare_cap_06 +  
                            health_cap_06 + education_cap_06 + 
                            foreignborn_delta, data = austria_authors.df_upd1)

bal.tab(m.out, thresholds = c(m = .1), un = TRUE)
Balance Measures
                        Type Diff.Un Diff.Adj        M.Threshold
distance            Distance  0.6520   0.3392                   
educ_tertiary        Contin.  0.4443   0.1043 Not Balanced, >0.1
registered_06        Contin.  0.5805   0.3800 Not Balanced, >0.1
avg_income           Contin.  0.2420   0.0743     Balanced, <0.1
lab_pct_manufact_01  Contin. -0.0888  -0.1974 Not Balanced, >0.1
lab_pct_unemp        Contin.  0.8857   0.1788 Not Balanced, >0.1
welfare_cap_06       Contin.  0.6576   0.2847 Not Balanced, >0.1
health_cap_06        Contin.  0.5589   0.3480 Not Balanced, >0.1
education_cap_06     Contin.  0.6096   0.2370 Not Balanced, >0.1
foreignborn_delta    Contin.  0.6280   0.2128 Not Balanced, >0.1

Balance tally for mean differences
                   count
Balanced, <0.1         1
Not Balanced, >0.1     8

Variable with the greatest mean difference
      Variable Diff.Adj        M.Threshold
 registered_06     0.38 Not Balanced, >0.1

Sample sizes
          Control Treated
All          2304      65
Matched        65      65
Unmatched    2239       0
Code
love.plot(m.out, stats = c("mean.diffs"),
          thresholds = c(m = .1, v = 2), abs = TRUE, 
          binary = "std",
          var.order = "unadjusted")
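To make the Diff.Un and Diff.Adj columns concrete, the following minimal sketch computes a standardized mean difference by hand. The helper `std_mean_diff` and the simulated data are hypothetical. Scaling by the standard deviation of the flagged group is an assumption about the convention; the exact defaults used by cobalt should be checked against its documentation.

```r
# Hypothetical helper: difference in group means, scaled by the SD of the flagged group.
std_mean_diff <- function(x, flagged) {
  (mean(x[flagged]) - mean(x[!flagged])) / sd(x[flagged])
}

# Small worked case: means 4 and 1, SD of the flagged group is 2
smd_small <- std_mean_diff(c(2, 4, 6, 1, 1, 1),
                           c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE))

# Simulated covariate where flagged units are shifted upward by half an SD
set.seed(2)
flagged <- rep(c(TRUE, FALSE), times = c(50, 950))
x <- rnorm(1000, mean = ifelse(flagged, 0.5, 0))
smd_sim <- std_mean_diff(x, flagged)

c(small = smd_small, simulated = smd_sim)
```

Under the 0.1 threshold used above, a covariate counts as balanced only when the absolute value of this quantity is below 0.1.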

We then re-estimate the authors' main specification on the subsample of outliers only:

Code
out1 <- felm(d_rr_06 ~ dv_pop_01*pct_noneu_06|0|0|bezirk, data =subset(austria_authors.df_upd, outlier98_up == TRUE)) %>%
  tbl_regression()%>%        
  add_significance_stars(hide_p = F, hide_se = F)%>%
  add_glance_table(include = c(nobs, r.squared))

tbl_merge_ex2 <-
  tbl_merge(
    tbls = list(out1),
    tab_spanner = c("")
  )
tbl_merge_ex2 
Characteristic
Beta1 SE2 p-value
dv_pop_01 -0.26*** 0.035 <0.001
pct_noneu_06 -0.98*** 0.069 <0.001
dv_pop_01 * pct_noneu_06 3.0*** 0.332 <0.001
No. Obs. 65
R² 0.687

1 *p<0.05; **p<0.01; ***p<0.001
2 SE = Standard Error

Model fit

Code
sqe1 <- double(nrow(austria_authors.df))
for (i in seq(nrow(austria_authors.df))) {
  row <- austria_authors.df[i, ]
  m <- lm(d_rr_06 ~ dv_pop_01 * pct_noneu_06, data = austria_authors.df[-i, ])
  sqe1[[i]] <- (row$d_rr_06 - predict(m, row))^2
}
print(glue::glue("LOO RMSE for interaction model: { sqrt(mean(sqe1, na.rm = TRUE)) }"))
LOO RMSE for interaction model: 0.0283408162629289
Code
sqe2 <- double(nrow(austria_authors.df))
for (i in seq(nrow(austria_authors.df))) {
  row <- austria_authors.df[i, ]
  m <- lm(d_rr_06 ~ dv_pop_01 + pct_noneu_06, data = austria_authors.df[-i, ])
  sqe2[[i]] <- (row$d_rr_06 - predict(m, row))^2
}
print(glue::glue("LOO RMSE for main effects: { sqrt(mean(sqe2, na.rm = TRUE)) }"))
LOO RMSE for main effects: 0.0284132384077463
Code
rm(sqe1)
rm(sqe2)
Code
sqe1 <- double(nrow(vienna_authors.df))
for (i in seq(nrow(vienna_authors.df))) {
  row <- vienna_authors.df[i, ]
  m <- lm(dv ~ pctrental * pctpublic_w_zsp, data = vienna_authors.df[-i, ])
  sqe1[[i]] <- (row$dv - predict(m, row))^2
}
print(glue::glue("LOO RMSE for interaction model: { sqrt(mean(sqe1, na.rm = TRUE)) }"))
LOO RMSE for interaction model: 0.0431238824470698
Code
sqe2 <- double(nrow(vienna_authors.df))
for (i in seq(nrow(vienna_authors.df))) {
  row <- vienna_authors.df[i, ]
  m <- lm(dv ~ pctrental + pctpublic_w_zsp, data = vienna_authors.df[-i, ])
  sqe2[[i]] <- (row$dv - predict(m, row))^2
}
print(glue::glue("LOO RMSE for main effects: { sqrt(mean(sqe2, na.rm = TRUE)) }"))
LOO RMSE for main effects: 0.0439284411572522
Code
rm(sqe1)
rm(sqe2)
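The loops above refit the model once per observation. For ordinary least squares, the same leave-one-out residuals can be obtained from a single fit, because the leave-one-out residual for observation i equals e_i / (1 - h_ii), where e_i is the ordinary residual and h_ii is the leverage. The sketch below illustrates this on simulated data and cross-checks it against an explicit refitting loop. The helper `loo_rmse` and the toy data are hypothetical and not part of the replication code.

```r
# Hypothetical helper: leave-one-out RMSE of a fitted lm, computed without refitting.
loo_rmse <- function(fit) {
  loo_resid <- residuals(fit) / (1 - hatvalues(fit))
  sqrt(mean(loo_resid^2))
}

set.seed(3)
# Simulated stand-in data with a genuine interaction effect
toy <- data.frame(x1 = runif(200), x2 = runif(200))
toy$y <- 0.02 * toy$x1 + 0.5 * toy$x1 * toy$x2 + rnorm(200, sd = 0.03)

fit_int  <- lm(y ~ x1 * x2, data = toy)
fit_main <- lm(y ~ x1 + x2, data = toy)

# Explicit refitting loop for the interaction model, as a cross-check
sqe <- double(nrow(toy))
for (i in seq_len(nrow(toy))) {
  m <- lm(y ~ x1 * x2, data = toy[-i, ])
  sqe[i] <- (toy$y[i] - predict(m, toy[i, ]))^2
}

c(interaction  = loo_rmse(fit_int),
  main_effects = loo_rmse(fit_main),
  loop         = sqrt(mean(sqe)))
```

The `interaction` and `loop` entries agree up to floating-point error. Note that `lm` silently drops rows with missing values, so on real data the shortcut averages over the rows actually used in the fit.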

Conclusion

We find that the research is computationally replicable, but the evidence for the main causal claims is overstated. While the analysis concentrates on patterns in outcomes that are likely relevant to understanding the underlying data-generating process, the statistical models supporting the causal claims explain only a small proportion of the overall variance in outcomes (R² = 0.036 in the main specification). Furthermore, the analysis elides relevant competing models. For example, main-effects-only models fit the data nearly as well as those including the interaction term that supports the main causal claim: the leave-one-out RMSE differs by about 0.3% in the Austria data (0.02841 versus 0.02834) and by about 1.8% in the Vienna data (0.04393 versus 0.04312).

The relative weakness of the claim is obscured in the published analysis because neither overall goodness of fit nor a comparison to 'naive' baseline models is included. While conceptual reproducibility analysis would be useful for exploring more complex alternative models, this route is obstructed by the absence of citations, documentation, and linking codes that would support reliable reanalysis using the original data, or augmentation of the authors' data with additional measures. We conjecture that, as a general practice, research reliability would be increased by including these elements in publication and data sharing.

References

Altman, Micah, Jeff Gill, and Michael McDonald. 2004. Numerical issues in statistical computing for the social scientist. Wiley series in probability and statistics. Hoboken, NJ: Wiley-Interscience.
Cavaille, Charlotte. 2023. “Replication Data for: How Distributional Conflict over in-Kind Benefits Generates Support for Far-Right Parties.” Harvard Dataverse. https://doi.org/10.7910/DVN/SYNP73.
Cavaillé, Charlotte, and Jeremy Ferwerda. 2023. “How Distributional Conflict over In-Kind Benefits Generates Support for Far-Right Parties.” The Journal of Politics 85 (1): 19–33. https://doi.org/10.1086/720643.

Footnotes

  1. Corresponding author.

  2. All authors declare that they have no financial support or conflict of interest in this publication.